Sequence sensitivity of breathing dynamics in heteropolymer DNA

We study the fluctuation dynamics of localized denaturation bubbles in heteropolymer DNA with
a master equation and complementary stochastic simulation based on novel DNA stability data. 
A significant dependence of opening probability and waiting time between bubble events on the 
local DNA sequence is revealed and quantified for a biological sequence of the T7 bacteriophage. 
Quantitative agreement with data from fluorescence correlation spectroscopy (FCS) is demonstrated.

The biological function of DNA largely relies on its physical properties: Protein binding is sensitive to local DNA structure […], DNA looping facilitates the search of binding proteins for their specific site […], and DNA knots impair transcription or act as barriers between different genome regions […]. Similarly, local denaturation of DNA is necessary for protein binding to DNA single strand […], and is implicated in transcription initiation […]. DNA melting has a long tradition in statistical physics […]. Its biological relevance is due to the fact that the free energy for breaking a single base pair (bp) at physiological temperature is $\sim k_B T$ […]. Renewed interest in DNA melting from a physics perspective is nourished by the possibility to measure the fluctuation dynamics of local denaturation bubbles by single molecule FCS [13].

We present a master equation (ME) and a complementary stochastic simulation that provide the time series of the bubble fluctuations. A full two-variable formulation in terms of bubble size $m$ and left fork location $x_L$ allows one to investigate an arbitrary sequence of bps, beyond previous homopolymer […] and random energy models [13]. In certain limits, the ME can be solved analytically. We employ DNA stability data from a novel approach that measures the ten stacking interactions separately and, inter alia, predicts a distinct asymmetry between AT/AT and AT/TA nearest neighbour bps […]. As shown by comparison with recent FCS experimental data, our model describes the bubble dynamics well with only one free parameter. We demonstrate the delicate sensitivity of bubble dynamics to the local sequence of heterogeneous DNA on the promoter sequence of the T7 bacteriophage, and illustrate good potential for nanosensor applications.

Model. With typical experimental setups [12] in mind, we consider a segment of double-stranded DNA with $M$ internal bps that is clamped at both ends (Fig. 1). The full sequence of bps enters via the position dependence of the statistical weights $u_{hb}(x) = \exp\{\epsilon_{hb}(x)/[k_B T]\}$ for breaking the hydrogen bonds of the bp at position $x$, and $u_{st}(x) = \exp\{\epsilon_{st}(x)/[k_B T]\}$ for disrupting the stacking interactions between bps $x-1$ and $x$. Due to the high free energy barrier for bubble initiation ($\xi \ll 1$), opening and merging of multiple bubbles are rare events, such that a one-bubble description is appropriate. The positions $x_L$ and $x_R$ of the zipper forks correspond to the right- and leftmost closed bp of the bubble. $x_L$ and $x_R$ are stochastic variables, whose time evolution in the energy landscape defined by the partition factor ($m \ge 1$)

$$\mathscr{Z}(x_L,m) = \frac{\xi'}{(1+m)^c} \prod_{x=x_L+1}^{x_L+m} u_{hb}(x) \prod_{x=x_L+1}^{x_L+m+1} u_{st}(x) \tag{1}$$

characterizes the bubble dynamics. $\mathscr{Z}$ is written in terms of $x_L$ and bubble size $m = x_R - x_L - 1$, with $\mathscr{Z}(m=0) = 1$. Here, $\xi' = 2^c \xi$, where $\xi \approx 10^{-3}$ is the ring factor for bubble initiation from Ref. […] that is related to the cooperativity parameter $\sigma_0 \approx 10^{-5}$ […] by $\sigma_0 = \xi \exp\{\epsilon_{st}\}$ […]. For the entropy loss on forming a closed polymer loop we assign the factor $(1+m)^{-c}$ and take $c = 1.76$ for the critical exponent [15]. Note that a bubble with $m$ open bps requires breaking of $m$ hydrogen bonds and $m+1$ stacking interactions.

FIG. 1: Clamped bubble domain with internal bps $x = 1$ to $M$, statistical weights $u_{hb}(x)$, $u_{st}(x)$, and tag position $x_T$.
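The partition factor of Eq. (1) can be evaluated directly from per-position weights. The following is a minimal sketch, not the authors' code: the function name `partition_factor`, the list layout (index = bp position, index 0 as padding) and the numerical weights of the example homopolymer are illustrative assumptions, while $\xi = 10^{-3}$ and $c = 1.76$ are the values quoted in the text.

```python
import math

def partition_factor(x_left, m, u_hb, u_st, xi=1e-3, c=1.76):
    """Statistical weight of a bubble of m open bps to the right of closed bp x_left.

    u_hb[x] and u_st[x] are indexed by bp position x; u_st[x] refers to the
    stacking between bps x - 1 and x. Index 0 is padding.
    """
    if m == 0:
        return 1.0                                   # closed ground state
    xi_prime = 2.0 ** c * xi                         # xi' = 2^c * xi
    hb = math.prod(u_hb[x_left + 1 : x_left + m + 1])   # m hydrogen-bond factors
    st = math.prod(u_st[x_left + 1 : x_left + m + 2])   # m + 1 stacking factors
    return xi_prime / (1.0 + m) ** c * hb * st

# Hypothetical homopolymer with M = 5 internal bps (weights chosen for illustration).
M = 5
u_hb = [1.0] + [0.8] * M + [1.0]      # positions 0 .. M + 1
u_st = [1.0] + [0.5] * (M + 1)        # stacking weights for x = 1 .. M + 1

single_bp_bubble = partition_factor(0, 1, u_hb, u_st)
```

For a one-bp bubble the loop factor cancels against $2^c$, so the weight reduces to $\xi\, u_{hb}\, u_{st}^2$ in this homopolymer example.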

The zipper forks move stepwise, $x_{L/R} \to x_{L/R} \pm 1$, with rates $t^{\pm}_{L/R}(x_L,m)$. We define for bubble size decrease

$$t_L^+(x_L,m) = t_R^-(x_L,m) = k/2 \qquad (m \ge 2) \tag{2}$$

for the two forks […]. The rate $k$ characterizes a single bp zipping. Its independence of $x$ corresponds to the view that bp closure requires the diffusional encounter of the two bases and bond formation; as AT and GC bps are sterically very similar, $k$ should not vary significantly with bp stacking. $k$ is the only adjustable parameter of our model, and has to be determined from experiment or future MD simulations. The factor 1/2 is introduced for consistency […]. Bubble size increase is controlled by

$$\begin{aligned} t_L^-(x_L,m) &= k\, u_{st}(x_L)\, u_{hb}(x_L)\, s(m)/2, \\ t_R^+(x_L,m) &= k\, u_{st}(x_R+1)\, u_{hb}(x_R)\, s(m)/2, \end{aligned} \tag{3}$$

for $m \ge 1$, where $s(m) = \{(1+m)/(2+m)\}^c$. Finally, bubble initiation and annihilation from and to the zero-bubble ground state, $m = 0 \leftrightarrow 1$, occur with rates

$$\begin{aligned} t_G^+(x_L) &= k\, \xi'\, s(0)\, u_{st}(x_L+1)\, u_{hb}(x_L+1)\, u_{st}(x_L+2), \\ t_G^-(x_L) &= k. \end{aligned} \tag{4}$$

The rates $t$ fulfill detailed balance conditions. The annihilation rate $t_G^-(x_L)$ is twice the zipping rate of a single fork, since the last open bp can close either from the left or right. Due to the clamping, $x_L \ge 0$ and $x_R \le M+1$, ensured by the reflecting conditions $t_L^-(0,m) = t_R^+(x_L, M - x_L) = 0$. The rates $t$ together with the boundary conditions fully determine the bubble dynamics.

FIG. 2: Scaling plot of $A_t(x_T,t)$ at various $T$ for the sequence AT9 from [12]. Inset: Relaxation time spectrum. See text.
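The statement that the rates fulfill detailed balance can be checked numerically. The sketch below is an illustration under assumptions: the function names, the list layout (index = bp position), the heterogeneous weights `U_HB` and `U_ST`, and the choice $k = 1$ as time unit are all hypothetical. It verifies that each forward rate times the weight of its starting state equals the backward rate times the weight of the final state, using the partition factor of Eq. (1) and the rates of Eqs. (2) to (4).

```python
import math

C, XI, K = 1.76, 1e-3, 1.0   # loop exponent c, ring factor xi, zipping rate k (time unit)

def s(m):
    return ((1.0 + m) / (2.0 + m)) ** C

def weight(xl, m, u_hb, u_st):
    """Partition factor of Eq. (1); the closed state m = 0 has weight 1."""
    if m == 0:
        return 1.0
    return (2.0 ** C * XI / (1.0 + m) ** C
            * math.prod(u_hb[xl + 1 : xl + m + 1])
            * math.prod(u_st[xl + 1 : xl + m + 2]))

def t_left_open(xl, m, u_hb, u_st):
    """t_L^-: left fork steps left; reflecting condition at xl = 0."""
    return 0.0 if xl == 0 else K * u_st[xl] * u_hb[xl] * s(m) / 2.0

def t_right_open(xl, m, u_hb, u_st):
    """t_R^+: right fork steps right; reflecting condition at xr = M + 1."""
    xr, big_m = xl + m + 1, len(u_hb) - 2
    return 0.0 if xr == big_m + 1 else K * u_st[xr + 1] * u_hb[xr] * s(m) / 2.0

def t_close(m):
    """t_L^+ = t_R^- = k/2, valid for m >= 2."""
    return K / 2.0

def t_initiate(xl, u_hb, u_st):
    """t_G^+: opening of bp xl + 1 from the closed ground state."""
    return K * 2.0 ** C * XI * s(0) * u_st[xl + 1] * u_hb[xl + 1] * u_st[xl + 2]

T_ANNIHILATE = K   # t_G^-

def detailed_balance_ok(u_hb, u_st, rel_tol=1e-12):
    big_m = len(u_hb) - 2
    for xl in range(big_m):
        # closed state <-> one-bp bubble
        if not math.isclose(t_initiate(xl, u_hb, u_st),
                            T_ANNIHILATE * weight(xl, 1, u_hb, u_st), rel_tol=rel_tol):
            return False
        for m in range(1, big_m - xl + 1):
            w = weight(xl, m, u_hb, u_st)
            if xl >= 1:              # left fork: (xl, m) <-> (xl - 1, m + 1)
                back = t_close(m + 1) * weight(xl - 1, m + 1, u_hb, u_st)
                if not math.isclose(t_left_open(xl, m, u_hb, u_st) * w, back, rel_tol=rel_tol):
                    return False
            if xl + m + 1 <= big_m:  # right fork: (xl, m) <-> (xl, m + 1)
                back = t_close(m + 1) * weight(xl, m + 1, u_hb, u_st)
                if not math.isclose(t_right_open(xl, m, u_hb, u_st) * w, back, rel_tol=rel_tol):
                    return False
    return True

# Hypothetical heterogeneous weights for M = 4 internal bps (illustrative values only).
U_HB = [1.0, 0.9, 0.7, 1.3, 0.6, 1.0]
U_ST = [1.0, 0.5, 0.8, 0.3, 0.9, 0.4]
```

Calling `detailed_balance_ok(U_HB, U_ST)` exercises every allowed fork move of the clamped four-bp domain.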

In the FCS experiment fluorescence occurs if the bps in a $\Delta$-neighbourhood of the fluorophore position $x_T$ are open […]. Measured fluorescence time series thus correspond to the stochastic variable $I(t)$, which takes the value 1 if at least all bps in $[x_T - \Delta, x_T + \Delta]$ are open, and is 0 otherwise. The time averaged ($\overline{\,\cdot\,}$) fluorescence autocorrelation

$$A_t(x_T,t) = \overline{I(t)I(0)} - \overline{I(t)}^{\,2} \tag{5}$$

for the sequence AT9 from [12] is rescaled in Fig. 2.
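As an illustration of the indicator $I(t)$ and the time averaged autocorrelation of Eq. (5), the sketch below works on a uniformly sampled time series. The function names and the sampling on a regular grid are assumptions made for this example; the experiment and the simulation may treat the time axis differently.

```python
def is_fluorescent(x_left, m, x_tag, delta):
    """Return 1 if all bps in [x_tag - delta, x_tag + delta] are open, else 0.

    The open bps of a bubble (x_left, m) are x_left + 1 .. x_left + m.
    """
    x_right = x_left + m + 1
    return 1 if (x_left < x_tag - delta and x_tag + delta < x_right) else 0

def time_averaged_autocorrelation(signal, lag):
    """Time average of I(t + lag) I(t) minus the squared time average of I."""
    n = len(signal) - lag
    mean = sum(signal) / len(signal)
    corr = sum(signal[i] * signal[i + lag] for i in range(n)) / n
    return corr - mean ** 2

# Short hand-made series of 0/1 values, used only to show the call pattern.
example_signal = [0, 1, 1, 0, 0, 1, 1, 0]
a_lag0 = time_averaged_autocorrelation(example_signal, 0)
a_lag1 = time_averaged_autocorrelation(example_signal, 1)
```

At zero lag the estimate reduces to the variance of the 0/1 signal, which is a convenient sanity check.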

ME. DNA breathing is described by the probability distribution $P(x_L,m,t)$ to find a bubble of size $m$ located at $x_L$, whose time evolution follows the ME $dP(x_L,m,t)/dt = W P(x_L,m,t)$. The transfer matrix $W$ incorporates the rates $t$. Detailed balance guarantees equilibration toward $\lim_{t\to\infty} P(x_L,m,t) = \mathscr{Z}(x_L,m)/\mathscr{Z}$, with $\mathscr{Z} = \sum_{x_L,m} \mathscr{Z}(x_L,m)$ […]. The ME and the explicit construction of $W$ are discussed at length in Refs. […]. Eigenmode analysis and matrix diagonalization produce all quantities of interest, such as the ensemble averaged autocorrelation function

$$A(x_T,t) = \langle I(t)I(0)\rangle - \langle I\rangle^2. \tag{6}$$

$\langle I(t)I(0)\rangle$ is proportional to the survival density that the bp is open at $t$ and that it was open initially […].
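To make the action of the transfer matrix concrete, the following sketch applies the gain and loss terms of the ME to a probability vector stored as a dictionary, and checks that the equilibrium distribution proportional to the partition factor is stationary. It is a minimal illustration, not the construction used in the cited references: the state encoding (`CLOSED` for $m = 0$, tuples `(xl, m)` otherwise), the function names and the example weights are assumptions.

```python
import math

C, XI, K = 1.76, 1e-3, 1.0   # loop exponent c, ring factor xi, zipping rate k (time unit)
CLOSED = None                # label for the zero-bubble ground state

def s(m):
    return ((1.0 + m) / (2.0 + m)) ** C

def weight(state, u_hb, u_st):
    if state is CLOSED:
        return 1.0
    xl, m = state
    return (2.0 ** C * XI / (1.0 + m) ** C
            * math.prod(u_hb[xl + 1 : xl + m + 1])
            * math.prod(u_st[xl + 1 : xl + m + 2]))

def transitions(state, u_hb, u_st):
    """List of (new_state, rate) pairs following Eqs. (2) to (4)."""
    big_m = len(u_hb) - 2
    if state is CLOSED:
        return [((xl, 1), K * 2.0 ** C * XI * s(0)
                 * u_st[xl + 1] * u_hb[xl + 1] * u_st[xl + 2]) for xl in range(big_m)]
    xl, m = state
    xr = xl + m + 1
    if m == 1:
        moves = [(CLOSED, K)]                                   # annihilation
    else:
        moves = [((xl + 1, m - 1), K / 2.0), ((xl, m - 1), K / 2.0)]  # zipping
    if xl >= 1:                                                 # left fork opens
        moves.append(((xl - 1, m + 1), K * u_st[xl] * u_hb[xl] * s(m) / 2.0))
    if xr <= big_m:                                             # right fork opens
        moves.append(((xl, m + 1), K * u_st[xr + 1] * u_hb[xr] * s(m) / 2.0))
    return moves

def all_states(big_m):
    return [CLOSED] + [(xl, m) for xl in range(big_m) for m in range(1, big_m - xl + 1)]

def master_equation_rhs(prob, u_hb, u_st):
    """dP/dt for every state: gain from incoming minus loss to outgoing transitions."""
    rhs = dict.fromkeys(prob, 0.0)
    for state, p in prob.items():
        for new_state, rate in transitions(state, u_hb, u_st):
            rhs[state] -= rate * p
            rhs[new_state] += rate * p
    return rhs

def equilibrium(u_hb, u_st):
    weights = {st: weight(st, u_hb, u_st) for st in all_states(len(u_hb) - 2)}
    total = sum(weights.values())
    return {st: w / total for st, w in weights.items()}

# Hypothetical weights for M = 4 internal bps (illustrative values only).
U_HB = [1.0, 0.9, 0.7, 1.3, 0.6, 1.0]
U_ST = [1.0, 0.5, 0.8, 0.3, 0.9, 0.4]
P_EQ = equilibrium(U_HB, U_ST)
RHS_EQ = master_equation_rhs(P_EQ, U_HB, U_ST)
```

For larger $M$ one would typically assemble the same rates into a matrix and diagonalize it with a linear algebra library; the dictionary form is used here only to keep the example dependency-free.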

In Fig. 2 the blue curve shows the predicted behaviour of $A(x_T,t)$, calculated for $T = 49\,^\circ$C with the parameters from [11]. As in the experiment, we assumed that fluorophore and quencher attach to bps $x_T$ and $x_T + 1$, and that both are required to be open to produce a fluorescence signal. From the scaling plot, we calibrate the zipping rate as $k = 7.1 \times 10^4$/s, in good agreement with the findings from Refs. […]. The calculated behaviour reproduces the data within the error bars, while the model prediction at $T = 35\,^\circ$C shows more pronounced deviation. Potential causes are destabilizing effects of the fluorophore and quencher, and additional modes that broaden the decay of the autocorrelation. The latter is underlined by the fact that for lower temperatures the relaxation time distribution $f(\tau)$, defined by $A(x_T,t) = \int \exp(-t/\tau) f(\tau)\, d\tau$, becomes narrower (Fig. 2 inset). Deviations may also be associated with the correction for diffusional motion of the DNA construct, measured without quencher and neglecting contributions from internal dynamics [21]. Indeed, the black curve shown in Fig. 2 was obtained by a 3% reduction of the diffusion time […]; see details in [19].

Stochastic simulation. Based on the rates $t$, stochastic simulations give access to single bubble fluctuations […, 23]. Our customized Gillespie algorithm uses the joint probability density of waiting time $\tau$ and path $\mu = \pm$,

$$P(\tau,\mu,\nu) = t_\nu^\mu(x_L,m) \exp\Big(-\tau \sum_{\mu,\nu} t_\nu^\mu(x_L,m)\Big), \tag{7}$$

defining for a given state $(x_L,m)$ after what time $\tau$ the next step of fork $\nu \in \{L,R\}$ occurs. The formulation via the waiting time density $P$ is computationally economical, avoiding the large number of unsuccessful opening attempts in traditional Langevin simulations. Using Eq. (7) we obtain the single bubble time series in Fig. 3.
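The sampling rule implied by Eq. (7) is that the waiting time is exponentially distributed with the total exit rate of the current state, and that each channel is chosen with probability proportional to its rate. A minimal generic sketch follows; the function names, the callable `transitions` interface, and the two-state toy model with rates 1 and 3 are assumptions made for illustration and stand in for the full set of fork rates.

```python
import random

def gillespie(transitions, state, t_max, rng):
    """Return the jump history [(time, state), ...] up to time t_max.

    transitions(state) must return a list of (new_state, rate) pairs.
    """
    t, series = 0.0, [(0.0, state)]
    while True:
        channels = list(transitions(state))
        total = sum(rate for _, rate in channels)
        if total == 0.0:
            break                                  # absorbing state
        t += rng.expovariate(total)                # waiting time ~ total * exp(-total * tau)
        if t >= t_max:
            break
        pick = rng.random() * total                # choose channel with probability rate / total
        for new_state, rate in channels:
            pick -= rate
            if pick < 0.0:
                break
        state = new_state
        series.append((t, state))
    return series

def occupancy(series, t_max, target):
    """Fraction of the interval [0, t_max] spent in the state `target`."""
    ends = series[1:] + [(t_max, None)]
    spent = sum(t1 - t0 for (t0, st), (t1, _) in zip(series, ends) if st == target)
    return spent / t_max

def toy_transitions(state):
    """Hypothetical two-state model: opening rate 1, closing rate 3."""
    return [("open", 1.0)] if state == "closed" else [("closed", 3.0)]

T_MAX = 20000.0
history = gillespie(toy_transitions, "closed", T_MAX, random.Random(1))
open_fraction = occupancy(history, T_MAX, "open")
```

In the toy model the long-time fraction of time spent open approaches $1/(1+3) = 0.25$, which mirrors the agreement between time averages and ensemble averages discussed in the text. For the bubble problem, `transitions` would return the fork moves and rates of Eqs. (2) to (4) for the current $(x_L, m)$.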

Phage T7 analysis. By ME and stochastic simulation we investigate the promoter sequence of the T7 phage,

AGTATAGGGACAATGCTTAAGGTCGCTCTCTAGGAg-3'    (8)

whose TATA motif is marked red […]. Fig. 3 shows the time series of $I(t)$ at 37 °C for the tag positions $x_T = 38$ in the core of TATA, and $x_T = 41$ at the second GC bp after TATA: Bubble events occur much more frequently in TATA (AT/TA bps are particularly weak [11]). This is quantified by the density of waiting times $\psi(\tau)$ in the $I = 0$ state, whose characteristic time scale is more than an order of magnitude longer at $x_T = 41$. In contrast, we observe similar behaviour for the density $\phi(\tau)$ in the $I = 1$ state for $x_T = 38$ and 41. Both $\psi(\tau)$ and $\phi(\tau)$ decay exponentially for long $\tau$; the overlaid lines represent numerical evaluation of the ME, see [19]. As shown in the bottom panel for the parameters from […], the variation of the mean correlation time $\tau_{corr} = \int A(x_T,t)\, dt$ obtained from the ME is small for the entire sequence, consistent with the low sequence sensitivity of $\phi(\tau)$. Note the even smaller variation predicted for the parameters of […].

FIG. 3: Time series $I(t)$ for the T7 promoter, with $x_T = 38$, 41. Middle: Waiting time ($\psi(\tau)$) and fluorescence time ($\phi(\tau)$) densities. Bottom: Mean fluorescence time for $\Delta = 0$.
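The mean correlation time can be related to the relaxation time distribution $f(\tau)$ introduced for Fig. 2. As a short worked step, assuming that the time integral runs from $0$ to $\infty$ and that the order of the two integrations may be exchanged, $\tau_{corr}$ is the first moment of $f(\tau)$:

```latex
\tau_{corr}
  = \int_0^\infty A(x_T,t)\,dt
  = \int f(\tau) \left[ \int_0^\infty e^{-t/\tau}\,dt \right] d\tau
  = \int \tau\, f(\tau)\, d\tau .
```

Under these assumptions slow modes enter $\tau_{corr}$ with a weight proportional to their relaxation time.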

Fig. 4 shows the equilibrium probability that the bps $[x_T - \Delta, x_T + \Delta]$ are open, as necessary for fluorescence to occur. We plot data obtained from the zeroth mode of the ME together with the time average from the Gillespie algorithm (GA), finding excellent agreement. Whereas for $\Delta = 0$ several segments show an increased tendency to open, for the case $\Delta = 2$ one major peak is observed; the data from [11] coincide precisely with TATA, while the data from […] peak upstream. Also shown is a comparison to the opening probability of a random sequence, demonstrating that the enhanced opening probability at TATA is significant, compare […]. Analyses for various $\Delta$ indicate the best discrimination of the TATA sequence being open for $\Delta = 2$. For future FCS or energy transfer experiments, it therefore appears important to optimise the $\Delta$-dependence for best resolution, e.g., by adjusting the linker lengths of fluorophore and quencher [25].
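The equilibrium opening probability follows from summing the partition factors of all bubble states that leave the window $[x_T - \Delta, x_T + \Delta]$ open and dividing by the total partition function. The sketch below illustrates this bookkeeping; the function names, the list layout (index = bp position) and the numerical weights are assumptions for the example, not the stability data used in the paper.

```python
import math

def weight(xl, m, u_hb, u_st, xi=1e-3, c=1.76):
    """Partition factor of Eq. (1) for a bubble of m >= 1 open bps."""
    return (2.0 ** c * xi / (1.0 + m) ** c
            * math.prod(u_hb[xl + 1 : xl + m + 1])
            * math.prod(u_st[xl + 1 : xl + m + 2]))

def opening_probability(x_tag, delta, u_hb, u_st, xi=1e-3, c=1.76):
    """Equilibrium probability that all bps in [x_tag - delta, x_tag + delta] are open."""
    big_m = len(u_hb) - 2
    total, open_weight = 1.0, 0.0            # the closed state contributes weight 1
    for xl in range(big_m):
        for m in range(1, big_m - xl + 1):
            w = weight(xl, m, u_hb, u_st, xi, c)
            total += w
            if xl < x_tag - delta and x_tag + delta < xl + m + 1:
                open_weight += w
    return open_weight / total

# Hypothetical weights for M = 4 internal bps (illustrative values only).
U_HB = [1.0, 0.9, 0.7, 1.3, 0.6, 1.0]
U_ST = [1.0, 0.5, 0.8, 0.3, 0.9, 0.4]
profile_delta0 = [opening_probability(x, 0, U_HB, U_ST) for x in range(1, 5)]
```

Scanning `x_tag` over the sequence for several values of `delta` gives the kind of profile described for Fig. 4; a larger window can only lower the probability, since fewer bubble states satisfy the condition.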

FIG. 4: Probability to have bps $[x_T - \Delta, x_T + \Delta]$ open.

Nanosensing. Fig. 5 shows the dependence of the mean correlation time of the AT9 sequence on salt concentration $C$ and $T$. The variation with $C$ and $T$ is significant, pointing toward potential applications of DNA fluorescence constructs as nanosensors […]. The triangles denote the melting concentration of infinitely long random AT and GC stretches, respectively (see […]). The maxima of the $\tau_{corr}$ curves mark the critical slowing down of the autocorrelation at the phase transition point beyond which the bubble is preferentially open, see also [26]. Note that the maxima coincide with the melting concentrations in the bottom panel. The dashed line ($\tau_{\max}$ 2D) corresponds to the longest relaxation time obtained numerically from the ME; it agrees well with $\tau_{corr}$ close to the maximum, and analogously for the other $T$. The horizontal line ($\tau_{\max}$ 1D) represents the longest relaxation time $(2M+1)^2/(\pi^2 k)$ obtained from the homopolymer model of Ref. […] in the limit $u \to 1$, $\sigma_0 \to 0$ and $c = 1$ ($M = 27$, the length of the AT9 construct), with the same scaling as the first exit of unbiased diffusion.
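For orientation, the homopolymer estimate of the longest relaxation time can be evaluated numerically. The snippet below is a sketch under assumptions: it reads the expression as $(2M+1)^2/(\pi^2 k)$ and inserts the zipping rate $k = 7.1 \times 10^4$/s calibrated from the scaling plot, a combination that the text does not state explicitly. The function name `tau_max_1d` is hypothetical.

```python
import math

def tau_max_1d(num_bps, zipping_rate):
    """Longest relaxation time (2M + 1)^2 / (pi^2 k) of the homopolymer limit, in seconds."""
    return (2 * num_bps + 1) ** 2 / (math.pi ** 2 * zipping_rate)

# M = 27 is the length of the AT9 construct quoted in the text.
tau_at9 = tau_max_1d(27, 7.1e4)
```

With these inputs the estimate is of the order of a few milliseconds, and it grows quadratically with the construct length, as expected for the first exit of unbiased diffusion.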



Discussion. Previous bulk melting studies provided DNA stability data […], on whose basis the relation between local sequence stability and coding properties of the associated genes was shown […]. However, it is single molecule experiments that permit the study of the dynamics of DNA denaturation and renaturation [12]. Here we derive a physical framework for the opening and closing fluctuations of intermittent DNA bubbles in an arbitrary sequence of bps, using the positions of the two bubble zipper forks as fundamental coordinates. By comparison with previously unpublished FCS data we demonstrate the predictive power of our model. As a complementary approach based on the same (un)zipping rates, we introduced the stochastic Gillespie simulation, which provides the time series of single bubble fluctuations. The time averages from the stochastic simulation agree well with the ensemble properties derived from the ME. Owing to its computationally attractive formulation based on the waiting time, the Gillespie approach allows one to include additional effects such as protein binding dynamics, or to consider longer chains and multibubble states. For a long homopolymer our model is analytically tractable […, 19].

FIG. 5: Mean correlation time versus salt for various $T$ (top), and melting curves versus salt concentration and $T$ (bottom). Salt axis: $C$(NaCl) [M].

We used recent DNA stability data from [11] based on separation of hydrogen bond and stacking energies, a distinct feature being the low stacking in a TA/AT stack, translating into a pronounced instability of the TATA motif, as shown for the T7 promoter sequence. The relevance of stacking interactions is also shown in the inset in Fig. 3, exhibiting pronouncedly different melting behaviour despite identical AT and GC contents for the constructs in [13, 28]. Regarding the biological relevance of TATA, from our analysis it may be speculated that it is not primarily the bubble lifetime (typically shorter than the timescale of protein conformational changes) but the recurrence frequency of bubble events that triggers the initiation of transcription. Note that typical binding energies of TATA binding proteins exceed the free energy to break up TATA, while both energies are comparable for a random sequence of the same length.

Given the high sensitivity of bubble dynamics to the stability parameters, it should be of interest to employ FCS on designed DNA constructs to obtain stability data for different DNA structures more accurately and to calibrate the (un)zipping rates.

We thank G. Altan-Bonnet and A. Libchaber for sharing the data for Fig. 2, M. Frank-Kamenetskii for discussion and access to the new stability data prior to publication, and M. A. Lomholt and K. Splitorff for discussion.